Multi-variate model of T cell clonotype competition and homeostasis

Diversity of the naive T cell repertoire is maintained by competition for stimuli provided by self-peptides bound to major histocompatibility complexes (self-pMHCs). We extend an existing bi-variate competition model to a multi-variate model of the dynamics of multiple T cell clonotypes which share stimuli. In order to understand the late-time behaviour of the system, we analyse: (i) the dynamics until the extinction of the first clonotype, (ii) the time to the first extinction event, (iii) the probability of extinction of each clonotype, and (iv) the size of the surviving clonotypes when the first extinction event takes place. We also find the probability distribution of the number of cell divisions per clonotype before its extinction. The mean size of a new clonotype at quasi-steady state is an increasing function of the stimulus available to it, and a decreasing function of the fraction of stimuli it shares with other clonotypes. Thus, the probability of, and time to, extinction of a new clonotype entering the pool of T cell clonotypes is determined by the extent of competition for stimuli it experiences and by its initial number of cells.

An adult human has approximately 4 × 10 11 T cells 1 , each of them expressing about 3 × 10 4 identical T cell receptors (TCRs) on its surface 2 .These receptors recognise self-peptides bound to major histocompatibility complexes (MHCs), which as bound binary complexes are called self-pMHCs.The interaction between TCRs and self-pMHCs induces a T cell to synthesise proteins important for survival and homeostatic proliferation [3][4][5][6] .While the number of distinct TCRs in the naive T cell repertoire 7 is at least 10 7 -10 8 , the total number of T cells is such that most TCRs are present on many different T cells.These sub-populations of T cells sharing the same TCR are called T cell clonotypes.If we consider a single naive T cell clonotype with relatively little competition for self-pMHCs with other clonotypes, then its population dynamics can be modelled as a uni-variate birth and death process 8 .Thus, for these TCR clonotypes, self-pMHC stimulation promotes establishment in the periphery [9][10][11][12] .Competition implies that clonotypes are susceptible to extinction.
TCRs are inherently cross-reactive: one TCR can interact with many different pMHCs 13 .Individual TCRs have been estimated to recognize 10 6 different pMHCs 14 , suggesting an overlap in the sets of self-pMHCs that stimulate different T cell clonotypes.Without this overlap, extinction of a clonotype would decrease the coverage of the TCR repertoire over the space of foreign peptides, which is known to be maintained even in the presence of such extinction events 15,16 .
A similar mathematical model can be used for two clonotypes which compete for self-pMHC survival stimuli.In this case, a bi-variate Markov competition process can be defined as in Ref. 17 .This bi-variate model can be used to show that extinction, for sufficiently late times, is certain for both clones, i.e., after a transient time one clonotype will become extinct and the remaining one will behave as described by the uni-variate model 8 .This is a closer representation of the competition for survival stimuli experienced by the naive T cell repertoire 11,12 .However, the highly oligoclonal nature of immune responses 18 and the occurrence of similar TCRs 19 serve as evidence that the self-pMHC recognition profile overlap will typically extend to more than two clonotypes.
Here, we propose a generalisation of the model presented in Ref. 17 to characterise the competition of a number, η , of different T cell clonotypes ( η ≥ 3 ) with significant self-pMHC recognition overlap.Significant in this context means that the number of self-pMHCs shared by the clonotypes under consideration is so large that they cannot be modelled as single clonotypes 8 , nor as pair-wise competitors 17 .It is assumed that naive T cells of a

Stochastic multi-variate model of naive T cell clonotype competition for self-pMHC stimuli
Let us consider two sets: the set, C , of η different clonotypes with a significant overlap in the self-pMHCs they recognise (see Fig. 1 of Ref. 25 ), and Q , the set of all self-pMHCs which can stimulate clonotypes in C .We will describe the number of T cells belonging to each of the η clonotypes at time t as a continuous time multi-variate Markov process, X = (X 1 (t), . . ., X η (t)) : t ≥ 0 , over the state space S = (n 1 , . . ., n η ) : n i ≥ 0, ∀i = N η 0 , where X i (t) represents the number of cells of clonotype i at time t (for 1 ≤ i ≤ η ), and X(t) = X 1 (t), . . ., X η (t) is the random vector describing the population of η clonotypes being modelled at time t.
We assume that all cells of a particular clonotype exit the thymus at roughly the same time.However different clonotypes can exit the thymus at different times.Since we are interested in modelling the competition dynamics of all clonotypes in C , we consider the initial time t = 0 in our process X so that all clonotypes in C are already present in the periphery.
The division rate of cells belonging to a clonotype depends on the competition between clonotypes for shared self-pMHC stimuli 11,12,25,26 .To this end we consider a tripartite recognition network (see Fig. 1).In a recognition network each clonotype (green circle) is able to receive stimuli from a set of self-pMHCs (blue circles), and this ability is represented by an edge between the clonotype and the self-pMHC.We partition all peripheral naive T cell clonotypes as follows: clonotypes in the periphery are in C , if they are explicitly modelled, or in M , if

Figure 1.
Tripartite network of TCR-self-pMHC recognition.Each blue circle represents a self-pMHC and each green circle a T cell clonotype.A clonotype is explicitly modelled if it is in C , or part of the periphery but not explicitly modelled, if it is in M .This implies C ∩ M = ∅ .An edge between a blue and a green circle represents the ability of that T cell clonotype to receive stimulus from the self-pMHC.For a given self-pMHC, q ∈ Q , the set of clonotypes it can stimulate in C is C q , and for a given clonotype, i ∈ C , the set of self-pMHCs that can stimulate it is Q i .they are not explicitly modelled.We note that this definition implies C ∩ M = ∅ .Each clonotype i ∈ C has an associated set of self-pMHCs that stimulate it, denoted by Q i (see Fig. 1).
We assume all self-pMHCs provide the same (constant) rate of homeostatic proliferation stimulus, and denote it by γ 8,17 .Then, we can write the total homeostatic proliferation stimulus each naive T cell of clonotype i receives if the system of η clonotypes is in state n = (n 1 , . . ., n η ) ∈ S as follows where n q is the total number of naive T cells in the periphery ( C ∪ M ) that are stimulated by self-pMHC q (see Fig. 1).With this stimulus rate we can now define the birth rate of clonotype i in state n ∈ S , as the transi- tion rate from state n to state n (+i) :=(n 1 , . . ., n i−1 , n i + 1, n i+1 , . . ., n η ) , which is given by Similarly, the transition rate from state n ∈ S to state n (−i) :=(n i , . . ., n i−1 , n i − 1, n i+1 , . . ., n η ) is the death rate of clonotype i, and it is given by If n i = 0 for any clonotype i, with 1 ≤ i ≤ η , its birth and death rates will both be zero, in agreement with our assumption on thymic production.Thus, we observe that the set of states with at least one entry equal to zero, A = (n 1 , . . ., n η ) ∈ S : n i = 0 for any i , with 1 ≤ i ≤ η , is an absorbing set, and the state (0, . . ., 0) is an absorbing state representing the extinction of all η clonotypes 27,28 .
Let us consider two states, n = (n 1 , . . ., n η ) and m = (m 1 , . . ., m η ) in S .The transition probability from n to m in a small time interval, t , is defined as and in the limit t → 0 + , this transition probability satisfies The clonotypes in C are explicitly modelled by the process X , yet there are other clonotypes in the naive T cell repertoire which can also receive stimuli from self-pMHCs in Q , but which do not overlap significantly with the clonotypes in C .These clonotypes are contributing to the competition for stimuli as a "sink", in the sense that they are taking a portion of the stimuli, but their population dynamics is not explicitly modelled.We denote the set of these clonotypes by M and define its cardinality to be |M| = M .We define C q as the set of clonotypes in C which can receive stimuli from self-pMHC q ∈ Q , and separate n q into the number of cells in C q , which receive stimuli from self-pMHC q, and the number of cells in M , which receive stimuli from self-pMHC q.Since we do not explicitly model the cells in M , we will assume these populations are homeostatically established 8 , and thus, have a constant size (see Eq. (A.9) in the Supplementary Material).

Approximation of the transition rates
In what follows we illustrate the definition of the transition rates in the case η = 3 ; that is, the case where only three clonotypes are explicitly modelled.For the general formulation of these rates, details have been provided in Appendix A (see Supplementary Material).We focus on this specific case for two main reasons.First, when considering the generalisation of a two-dimensional competition model, a three-dimensional model is the simplest case to be studied.Furthermore, as shown in Appendices E, F, and G in the Supplementary Material, after a finite transient time a clonotype will become extinct, and the competition process can be modelled using a process with dimension one fewer than the original.Second, the computational cost of calculating quantities of interest for a model with more than three clonotypes is presently too high, making parameter exploration, and analysis of such scenarios intractable.
We first write Eq. (1) making use of the following definition.Let n iq , for i = 1, 2, 3 and q ∈ Q i , be the number of cells not of clonotype i that can receive stimulus from self-pMHC q; that is, n iq = n q − n i for i = 1, 2, 3 , so that Eq. (1) becomes By writing the per-cell birth rate in this manner it is easy to see that it depends not only on clonotype i, but on all other clonotypes which compete for stimuli from self-pMHCs in Q i .In addition, we can subdivide the set of self-pMHCs that can stimulate clonotype i into sets Q k ij , where j denotes the number of other clonotypes that the stimulus is shared with (including the case j = 0 , where the stimulus is not shared with other clonotypes), and k indexes the different possible choices of j clonotypes (see Eq. (A.2) and Eq.(A.3) in the Supplementary Material).Then, using these subsets we can write Eq. ( 4) for clonotype 1 as follows (1) (3) otherwise.
(4) (A.12) in the Supplementary Material).By doing so we introduce two new parameters: the mean niche overlap ν k ij , which represents the expected number of clonotypes in M that receive stimulus from self-pMHCs in Q k ij , and n , the average number of cells for each clonotype in M .Then, the per-cell birth rate of clonotype 1 can be approximated as where ϕ 1 is the total stimulus available to clonotype 1, and p k 1,j is the probability of a self-pMHC in We note that given these definitions of ϕ i and p k ij , we have the following relations (see Eq. (A.15) in the Supplementary Material): Finally, the per-cell birth rate shown in Eq. ( 6) can be simplified for two limiting cases.The first one when ν k ij ≪ 1 for all ν k ij , called the "hard niche" limit, which is characterised by weak competition with clonotypes not explicitly modelled.In this case the birth rate for clonotype 1 simplifies to The second case, where ν k ij ≫ 1 for all ν k ij , called the "soft niche", in which there is significant competition with clonotypes not explicitly modelled.In this case the birth rate of clonotype 1 is approximated by

Quasi-stationary probability distribution
We now want to study the behaviour of η competing clonotypes before the first extinction event occurs.In order to do so, we introduce the quasi-stationary probability distribution (QSD), which describes the late time behaviour of the process conditioned on non-extinction 23,29,30 .We introduce p n (t) , the probability that at time t the competition process X is in state n , given that it started in state n 0 , i.e., ( 5) (1) (1) Let us consider the absorbing set A = (n 1 , . . ., n η ) : n i = 0 for any i and denote by p A (t) the probability that at time t the process is not in A .Now, we define the probability that the process is in state n ∈ S \ A at time t given that absorption into A has not occurred yet, as follows Finding an analytical solution for the probabilities defined in Eq. ( 11) is in general not possible (see Appendix B in the Supplementary Material), and thus, the QSD will be numerically approximated.We discuss two useful approximations in the following section, which were proposed in Refs. 23,24.

Approximation of the QSD: two auxiliary processes
We approximate the QSD making use of two auxiliary competition processes 24 .In the first approximation we will consider the multi-variate Markov process X (1) = X (1) n , is given by Eq. (A.14) in the Supplementary Material, and its death rate by This process, so defined, makes the extinction of clonotypes impossible.Thus, the state space of X (1) is the set of states where no extinction has occurred The second process we will consider is X (2) = X (2) η (t) : t ≥ 0 , where the birth rates, 2,(i) n , are the same as those for X and X (1) , and we consider an immortal cell present in each clonotype; that is, the death rates are given by and the state space of X (2) is also A 0 .
To approximate the QSD for a multi-variate competition system, we will calculate the stationary probability distribution of the two auxiliary processes defined above 23,24 .To this end we first separate A 0 into subsets, which we will call levels, as follows for k = η, η + 1, η + 2, . . . .Then, these levels can be ordered as follows and the states in each level can be ordered using the colexico-graphical order, sometimes called reverse lexicographical order 31 (see Appendix C in the Supplementary Material).Note that we start at L 0 (η) , since any state with fewer than η cells in total does not belong to A 0 by definition.Using a combinatorial argument 32 , we find that We now introduce the plane η i=1 n i = N as a reflecting boundary on our state space.This means that we only consider states which have at most N cells in total.In practice, this truncation value can be chosen so that the probability of exceeding a total number of cells, N, in the population is negligible (see "Study of clonal extinction").Then, the number of states in A 0 is Finally, the stationary probability distributions of X (1) and X (2) can be computed with a linear level-reduction algorithm 33 (see Appendix D in the Supplementary Material), an outline of which is given in Algorithm D.1 in the Supplementary Material.

Study of clonal extinction
We next studied the behaviour of the competition process X and its extinction events with the use of stochastic descriptors and first step arguments.We first show that our system will reach the absorbing state with probability 1 (see Appendix E in the Supplementary Material); that is, all clonotypes are guaranteed to become extinct for late enough times.Then, we show that total extinction is not only guaranteed, but it occurs in finite time (see Appendix F in the Supplementary Material).However, it is important to note that this finite time could be on (11) Vol:.( 1234567890) Since the mean time to the extinction of all clonotypes is finite, the mean time to the first extinction event is also finite.We found an expression for the mean time to the first extinction event (see Appendix G in the Supplementary Material), to understand the timescales at which a model with η clonotypes can be simplified to a model with η − 1 clonotypes.Following this, we studied the distribution of clonal sizes at the time of the first extinction event to study changes in the population of surviving clonotypes (see Appendix H in the Supplementary Material).In calculating this probability distribution, we obtain two important descriptors of the model.First, the probability distribution of clonal sizes at the time of the first extinction event U , which we can separate into the following probabilities and second, the probabilities of each clonotype being the first to become extinct given an initial state n Finally, we studied the distribution of divisions before a clonotype becomes extinct given an initial state n (see Appendix J in the Supplementary Material); that is, we considered which can be used as a measure of proliferation of a T cell clonal family before its extinction.

Results
We now focus our study to the dynamics of two established clonotypes, which compete for self-pMHC stimuli with a third, new, clonotype that has just exited the thymus.Thus, the two clonotypes will be in a state defined by the mean of their two-dimensional QSD (and in the absence of the newly arrived clonotype).In this case, there are only two p k ij probabilities for each clonotype.The probability p i,0 that a self-pMHC is recognised only by clonotype i, and the probability p i,1 that a self-pMHC is recognised by both clonotypes.Thus, by Eq. (A. 15) in the Supplementary Material, we only need to choose a value for one of the probabilities to determine the rest.
We will denote the new clonotype as clonotype 1, and the established ones as clonotype 2 and clonotype 3, respectively.We consider all three clonotypes to have the same total homeostatic stimulus; that is, we assume ϕ 1 = ϕ 2 = ϕ 3 = 10 cells • year −120 .Before the arrival of clonotype 1 we hypothesise Q 2 ∩ Q 3 to be in the soft niche, and the remaining self-pMHCs in the hard niche.We remind the reader that "hard niche" implies little competition with clonotypes not explicitly modelled, and "soft niche" that self-pMHCs are significantly competed for with clonotypes not explicitly modelled.Figure 2A shows the competition scenario described for the two established clonotypes, where the size of each Q i circle represents the magnitude of ϕ i , and the colour of each region represents the magnitude of ν k ij , with a darker colour indicating a greater value.In this competition scenario the values of p 2,1 and ν 2,1 were chosen to be 1/3 and 1, respectively.Making use of the method described in Appendix G (see Supplementary Material), we calculated the mean time to extinction for all initial states with at most 10 2 total cells and found them to range between 80 to 125 years.Given these long extinction times it was appropriate to approximate the QSD of this competition process.In Fig. 2B we plot the marginal distribution for clonotype 2. We show this distribution for only one of the clonotypes, since the competition between them is symmetric, and thus clonotypes 2 and 3 are identical before the introduction of clonotype 1.We observe that for the competition scenario considered, with most of the self-pMHCs in the hard niche, the approximation X (1) better describes the behaviour of the system (see Gillespie simulations in Fig. 2B).However, we note that shifting X (2) by one cell results again in a good approximation.This is to be expected for these small population sizes and for the hypothesis of an immortal cell in this process.
Once clonotype 1 enters the periphery we will consider four different competition scenarios across a spectrum of symmetries, ranging from full symmetry to complete asymmetry.In the first scenario, Fig. 3A, we assume all clonotypes to be competing symmetrically; that is, all one-on-one competitions have the same probability.For the second scenario, Fig. 3B, we hypothesise symmetric competition between clonotype 1 and clonotypes 2 and 3, but these competitions are greater than the competition between clonotypes 2 and 3, leaving the new clonotype at a disadvantage.Our third scenario, Fig. 3C, has clonotype 1 competing more for homeostatic stimuli with clonotype 3 and less with clonotype 2, giving clonotype 2 an advantage.Our final scenario, Fig. 3D, represents the case of extreme asymmetry in which clonotypes 1 and 3 compete completely for stimuli.In all scenarios, we consider self-pMHCs recognised by more than one clonotype to be in the soft niche, with the value of the mean niche overlap increasing as the number of clonotypes increases.For the self-pMHCs recognised only by clonotype 1, we consider both hard and soft niches.
The diagrams in Fig. 3 were used as a guide to choose the probabilities shown in Table 1.Since we are considering three different clonotypes, we can use Eq. ( 7) to determine all the probabilities with only the four values given in Table 1.In the case of a single clonotype with no direct competitors ( η = 1 ), the mean time to extinction for mean niche overlap values greater than 10 is negligible 8 .Therefore, the values chosen for the mean niche overlap in our competition scenarios were: 1 for self-pMHCs recognised by two clonotypes, and 10 for self-pMHCs recognised by three clonotypes.Since we are considering clonotypes 2 and 3 to be homeostatically established, (B) Symmetric competition between clonotype 1 and clonotypes 2 and 3, but greater than the competition between clonotypes 2 and 3.   their initial state will be the mean of the QSD for the competition between them rounded to the nearest integer, which with the chosen parameters is the state (8, 8) (see Fig. 2B).

Clonal distributions at the first extinction event
Using the method described in Appendix H in the Supplementary Material, we calculate the distribution of clonal sizes at the time of the first extinction event for the four different scenarios, and with the new clonotype in the hard or soft niche.Figure 4 shows the distribution of clonal sizes with an initial state, n 0 = (4, 8, 8) , as well as the probability for each clonotype to be the first to become extinct, U i n 0 .A triangle represents the initial state considered, n 0 = (4, 8, 8) , and a diamond represents the mean of the resulting distribution of clonal sizes at the time of the first extinction event.
Our results indicate that the probability of the new clonotype (clonotype 1) being the first to become extinct drastically increases when we compare the hard and soft niche cases in all the scenarios except (d).This shows that if the clonotype introduced is in the soft niche, its probability of extinction is greater, putting it at a large disadvantage against the other two homeostatically established clonotypes.In scenario (d) however, we see that not only does the probability of extinction of clonotype 1 change, but both U 1 n 0 and U 3 n 0 see a marginal increase.This different behaviour can be explained given that for (d) not only a new clonotype is being introduced in the system, but clonotype 3 changes from hard to soft niche, since this is a quality of the set of self-pMHCs and not of the clonotype itself.This implies that while in scenario (d) clonotype 1 sees no significant change in its extinction probability between the hard and soft niche cases, if it exits the thymus with more clonotypes that will also compete with it, this will reduce the advantage of clonotypes it competes with.By comparing the probabilities of extinction for clonotype 1 in the four different scenarios, we observe that in the hard niche case the most favourable scenario for its survival is scenario (a), of symmetric competition.This is the scenario in which it has less competition with the other two clonotypes overall.On the other hand, when we consider the soft niche case, we see that the most favourable scenario is scenario (d), in which the new clonotype competes completely for stimuli with an established clonotype.One likely explanation of this seemingly counter-intuitive behaviour is that, in scenario (d) not only is clonotype 1 exerting pressure on the established clonotype, but other clonotypes not explicitly modelled are doing so too, implying that the competitive pressure from clonotypes in M is shared with between clonotypes 1 and 3.
If we focus on the cases when clonotype 1 is not the first to become extinct, we see that in scenario (a) we expect its size to have rebounded into an established state in which both surviving clonotypes have fewer cells than the mean of the QSD of two competing clonotypes (before clonotype 1 was introduced).As a result the new clonotype expanded, while the established one contracted.In the soft niche we see that the size of the new clonotype does not bounce back to a homeostatic state, but instead both surviving clonotypes see a reduction in their number of cells.In scenario (b) for the hard niche case, we again see a move to a homeostatic state.However, there is little change in the size of the new clonotype, and a decrease in the size of the established clonotype to match the population of the surviving one.In the soft niche for this scenario we see a similar behaviour coupled with a reduction in the population size of clonotype 1.In scenario (c) we see a break from the symmetry observed between U 2 n 0 and U 3 n 0 in the previous scenarios, since the competition considered is no longer symmetric.If clonotype 2 is the first to become extinct, we expect the population of clonotype 3 (which has a greater competition with clonotype 1) to decrease until it matches that of clonotype 1.On the other hand, if clonotype 3 becomes extinct first we see very little change in the population of clonotype 2 (which has a lower competition with clonotype 1), and only a minor change in the size of clonotype 1. Lastly, in scenario (d) we see an interesting change in the shape of the clonal size distribution when clonotype 2 becomes extinct first.Since clonotypes 1 and 3 directly compete for all stimuli, they behave as a single population.Thus, the distribution is no longer centred around a point, but around a line where the sum of both populations is constant.However given the uneven initial state, the distribution has more density on the end that has clonotype 3 surviving with more cells than clonotype 1.If clonotype 3 is the first to become extinct, we see the same behaviour observed before where the population of the larger surviving clonotype decreases to match that of the new clonotype.In the soft niche we see an interesting behaviour when clonotype 2 is the first to become extinct.Since both remaining Table 1.Parameters for the competition scenarios shown in Fig. 3.The base value considered for ϕ is 10, and we also consider the following values: ϕ = 1, 10 2 .clonotypes are in the soft niche, the distribution has most of its density accumulated around state (1, 1), implying that even if these two clonotypes survive they can very easily become extinct due to their small population sizes.
We now turn our focus to the U 1 distributions.We can see that even when the new clonotype is the first to become extinct, it has a negative effect on the populations of the established clonotypes, reducing their average sizes in all cases.To better understand the effect the new clonotype has on the homeostatically established clonotypes, we calculate the probability for each clonotype to become extinct as a function of the initial number of cells in the new clonotype and plotted them in Fig. 5.
The first thing we observe is that when comparing the hard and soft niche cases (clonotype 1) in each scenario, the probability of clonotype 1 becoming extinct first is always higher in the soft niche than in the hard niche case.For clonotypes 2 and 3 we see the opposite behaviour, with these clonotypes being more likely to become extinct first in the hard niche case.The only exception to this (by a minimal margin) is clonotype 3 in scenario (d), since in this case clonotype 3 has the added disadvantage of being in the soft niche, which increases its probability to become extinct.Another property we observe of the probability of extinction is that it very quickly becomes saturated; that is, it shows little sensitivity to changes in the number of initial cells in the new clonotype after a certain value (or threshold).The threshold level depends on the competition scenario considered.This implies that even a new clonotype with a small initial population can perturb the two-dimensional system and make one of the established clonotypes more likely to become extinct first.The largest difference between hard and soft niche cases is seen in scenario (a) for clonotype 1, where the probability of clonotype 1 being the first to become extinct is almost halved when comparing the soft to the hard niche.This is due to the fact that in this case the value of p 1 1,1 is highest, meaning that the proportion of self-pMHCs shared with clonotypes 2 and 3 is the lowest.Thus, a change in the mean niche overlap has a very strong effect on clonotype 1, since it changes from a scenario of low competition to one of complete competition.This is the same reason why we see such little change in scenario (d), since clonotype 1 is already competing for all of its stimuli and a change in the mean niche overlap has a much weaker effect.From this figure we also learn that in the hard niche case there are different optimal competition scenarios for the survival of the new clonotype.For initial numbers of cells fewer than six, we find the lowest probability of extinction in the symmetric competition scenario (a).For values between 6 and 30 the optimal competition is the asymmetrical competition scenario (b).Finally, for values above 30 complete competition with an established clonotype gives clonotype 1 the lowest probability of becoming extinct first.Interestingly, this behaviour is completely lost in the soft niche case, where we see that there is only one optimal competition scenario for all possible initial numbers of cells, namely complete competition with another clonotype (scenario (d)).
We also considered the effect of increasing and decreasing the total amount of stimulus available, ϕ , by an order of magnitude.In Fig. 6 we have plotted the probabilities of extinction, U i n 0 , for the case when ϕ = 1, 10 2 .For ϕ = 1 we see that, due to the scarcity of stimulus, there is little effect from the competition.Yet in the hard niche we still see smaller probabilities of extinction for clonotype 1, and higher probabilities for the established clonotypes.The strongest effect on the extinction probabilities in this case comes from the initial number of cells.However, in contrast to Fig. 5 the probabilities do not become saturated as quickly, meaning that in this case each cell of clonotype 1 has a weaker effect on the sizes of the established clonotypes.For ϕ = 10 2 we see the opposite behaviour, where the probabilities become saturated so quickly that they appear as almost completely horizontal lines, with the notable exception of scenario (d).The first thing we note is that for clonotype 2 all the probabilities are negligible, except for scenario (a) in the hard niche case.In this case, given the overabundance of stimulus, the competition between clonotypes has no effect and all clonotypes are equally likely to be the first to become extinct.In scenario (b), for both hard and soft niche cases, the increased competition causes the extinction probability to accumulate on clonotype 1.The asymmetric competition of scenario (c) accumulates Figure 5. Probability of extinction for each clonotype, U i n 0 , in the four scenarios with hard and soft niche cases (for clonotype 1) as a function of the initial number of cells in clonotype 1, calculated using the method described in Appendix H in the Supplementary Material.The initial number of cells of the other two clonotypes is the mean of the QSD of their two-dimensional competition process, namely (n 2 , n 3 ) = (8, 8).
the hard niche case, scenario (d) has the lowest median and the most narrow distribution for the new clonotype overall, followed by the distributions of scenarios (b), (c), and (a), in that order.This agrees with the probabilities of extinction for the initial state n = (4, 8, 8) in Fig. 5, where we see that these probabilities decrease in the same order.In the soft niche case we can see a direct relation between the probabilities of extinction and distribution of divisions for scenarios (a), (b), and (c), but not for scenario (d).This can be justified since the probabilities of extinction are changing due to an increased chance of clonotype 3 becoming extinct and not a direct change to the extinction of the new clonotype.

Discussion
Maintaining the diversity of the T cell repertoire is essential for the immune system to mount a strong and effective immune response [35][36][37][38] .Promoting survival of significantly different clonotypes maximises this diversity by allowing TCRs with similar self-pMHC recognition profiles to become extinct.However, it has been observed that several clonotypes often overlap on their recognition profile (e.g. 25,26,39) as a compromise between TCR diversity and coverage over the space of foreign antigens.This motivates the multi-variate representation we have developed in this manuscript, which extends and generalises that presented in 17 to three or more clonotypes.www.nature.com/scientificreports/ We showed that the cellular division rate of competing clonotypes decreases as the overlap in their recognition profiles increases.That is, we see the effects of competitive exclusion as the distribution of the number of divisions shows a decrease in its mode (the peak moves to the left), when there is an increase in the competition for self-pMHCs.Furthermore, we mathematically showed that extinction of all clonotypes is certain; that is, given sufficient time (which can be larger than the biological timescales) the populations of all clonotypes will become extinct.The mean time to this extinction event is bounded, and it depends on the mean number of cells per clonotype.A feature of the proposed competition process, prior to the first extinction event, is that the system is driven to a state where all clones have low numbers of cells, with the specific number of cells depending on their recognition overlap.This agrees with the biological assumption that in a homeostatic state each clonotype consists of only a few cells, with the exact number varying from clonotype to clonotype 20,40 .
Our multi-variate competition model shows that the introduction of a new clonotype to the periphery perturbs the homeostatically established clonotypes.We see this in the decreased mean number of cells after the first extinction event when compared to the initial state of the process.Even in the scenario with the least competition for stimuli (scenario (a)) we see a perturbation of the established clonotypes.More than this, the probability of extinction as a function of the initial number of cells in the new clonotype very quickly becomes saturated (see Fig. 5).This implies that even a clonotype with a low number of initial cells has great potential to perturb already established clonotypes.
From the distribution of divisions before extinction, we can say that in the soft niche case we always expect the new clonotype to experience very little proliferation before becoming extinct.Even in scenario (a), where competition with other modelled clonotypes is the lowest, we see a distribution of divisions that is very narrow and centred around a small value.This major change outlines an important feature of the soft niche; even when considering a mean niche overlap of only ten clonotypes, the probability of extinction is greatly increased and the distribution of divisions is made more narrow and moved to the left.This major change in behaviour between the hard and soft niche cases can be interpreted in two ways: (i) for the soft niche assumption to be correct in the naive T cell repertoire, the mean niche overlap value must be a small number of clonotypes ( 1 < ν k ij < 10 ), otherwise a new clonotype entering the periphery would have a very low probability of becoming established, and the naive T cell repertoire would become mostly static; (ii) if the mean niche overlap is not small, then T cells in the soft niche must have a very fast turnover, and make up for most of the thymic output to counteract this low probability of proliferation and establishment.These two interpretations are made under the assumption that the thymic output is mostly homogeneous; that is, we assume that most of the cells exiting the thymus are either part of the hard or soft niche.By relaxing this assumption and considering a heterogeneous thymic output, instead of a homogenous one, we can think of the naive T cell repertoire as being divided in two sets: one set comprised of clonotypes in the hard niche, which is deeply established (with low extinction probabilities) and is expected to remain mostly constant through life, and another set of clonotypes in the soft niche, which is constantly changing and is more dynamic through life (as clonotypes appear from the thymus and become extinct due to clonal competition).This type of heterogeneity would allow for the maintenance of a T cell repertoire that has the capability of constantly producing new clonotypes (in the soft niche), while maintaining other clonotypes (in the hard niche) throughout life.
So far we have only discussed the possible effects of a distribution of soft and hard niche clonotypes exiting the thymus, assumed to be constant throughout the ageing process 37,41,42 .However, given the known changes in thymic output through life, such as thymic involution 43,44 , and other age-related changes 45 , it is plausible that the composition of thymic emigrants (in terms of hard and soft niche, or clonal size) changes as well during the lifespan of a host.If we consider thus, not a constant mixture of hard and soft niche clonotypes exiting the thymus, but a mixture that evolves from mostly hard niche clonotypes during fetal stages to mostly soft niche clonotypes during adulthood, then our model predicts the early establishment of clonotypes that exited the thymus early in life, and then a declining supply of mostly short-lived clonotypes later in life.This decline in the production of soft niche clonotypes could be justified by the fact that during the initial development of the T cell repertoire, we naturally expect there to be fewer clonotypes in the periphery, and therefore smaller values of the mean niche overlap ν k ij .Furthermore, this behaviour is compatible with the analysis by Gaimann et al. 46 , in which T cell clonotype sizes were found to follow a power-law distribution, where clonotypes generated during the early fetal stages (characterised by no nucleotide insertions during V(D)J recombination 47 ) were found to be the most enriched in the periphery.A dynamical analysis, making use of the competition model proposed here, of clonotype diversity, sizes and niches, with parameters corresponding to a human or murine host, may be feasible.
= P the process is at state m at the time of the first extinction event X(0) = n, and clonotype i became extinct , (

Figure 2 .
Figure 2. (A)Competition scenario considered for the two established clonotypes, where they compete for 1/3 of their stimulus.The shaded region represents the subset of self-pMHCs considered to be in the soft niche, with a mean niche overlap value ν 2,1 = ν 3,1 = 1 .(B) Marginal distribution of the QSD for clonotype 2, approximated using the processes X (1) and X(2) defined in "Approximation of the QSD: two auxiliary processes".

Figure 3 .
Figure 3. Competition scenarios when a new clonotype is introduced in a two-dimensional clonotype system.Shaded areas represent sets of self-pMHCs that are in the soft niche, with a darker shade meaning a lager value of the mean niche overlap ( ν k ij ).Cross-hatched areas represent regions where we consider both the soft and hard niche cases. https://doi.org/10.1038/s41598-023-46637-4www.nature.com/scientificreports/

Figure 4 .
Figure 4. Distributions of clonal sizes at the time of the first extinction event (each column, U i for i = 1, 2, 3 , represents the extinction of clonotype i in the hard and soft niche cases (clonotype 1), green and blue, respectively, with initial state n 0 = (4, 8, 8) using the method described in Appendix H in the Supplementary Material).The fourth column shows the probability for each clonotype to be the first becoming extinct, U i n 0 .Each panel (a, b, c and d) corresponds to one of the competition scenarios shown in Fig. 3.A triangle represents the initial state and a diamond indicates the mean of the resulting distribution of clonal sizes at the time of the first extinction event.
Probability distribution of the number of divisions before extinction for the initial state n 0 = (4, 8, 8) in the hard niche case (for clonotype 1).Probability distribution of the number of divisions before extinction for the initial state n 0 = (4, 8, 8) in the soft niche case (for clonotype 1).

Figure 7 .
Figure 7. Probability distribution of the number of divisions before extinction for the initial state n 0 = (4, 8, 8) calculated using the method described in Appendix J (see Supplementary Material).(A) shows the distributions when clonotype 1 is in the hard niche case and (B) in the soft niche case. 3.